
    Improved variable selection with Forward-Lasso adaptive shrinkage

    Recently, considerable interest has focused on variable selection methods in regression situations where the number of predictors, p, is large relative to the number of observations, n. Two commonly applied variable selection approaches are the Lasso, which computes highly shrunk regression coefficients, and Forward Selection, which uses no shrinkage. We propose a new approach, "Forward-Lasso Adaptive SHrinkage" (FLASH), which includes the Lasso and Forward Selection as special cases, and can be used in both the linear regression and the Generalized Linear Model domains. As with the Lasso and Forward Selection, FLASH iteratively adds one variable to the model in a hierarchical fashion but, unlike these methods, at each step adjusts the level of shrinkage so as to optimize the selection of the next variable. We first present FLASH in the linear regression setting and show that it can be fitted using a variant of the computationally efficient LARS algorithm. Then, we extend FLASH to the GLM domain and demonstrate, through numerous simulations and real world data sets, as well as some theoretical analysis, that FLASH generally outperforms many competing approaches. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/10-AOAS375
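    FLASH itself is not available in standard libraries, but its two special cases are. A minimal sketch (using scikit-learn, with illustrative data and the regularization level chosen only for demonstration) contrasts the Lasso, which shrinks coefficients, with Forward Selection, which does not:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(0)
n, p = 50, 200                      # p large relative to n
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]         # only three true predictors
y = X @ beta + 0.1 * rng.standard_normal(n)

# Lasso: L1 penalty shrinks coefficients, many exactly to zero
lasso = Lasso(alpha=0.1).fit(X, y)
lasso_support = np.flatnonzero(lasso.coef_)

# Forward Selection: adds one variable at a time, no shrinkage
sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=3, direction="forward"
).fit(X, y)
fwd_support = np.flatnonzero(sfs.get_support())
```

    FLASH interpolates between these two endpoints by tuning the amount of shrinkage at each step of the forward pass.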

    Functional linear regression that's interpretable

    Regression models to relate a scalar Y to a functional predictor X(t) are becoming increasingly common. Work in this area has concentrated on estimating a coefficient function, β(t), with Y related to X(t) through ∫β(t)X(t)dt. Regions where β(t)≠0 correspond to places where there is a relationship between X(t) and Y. Alternatively, points where β(t)=0 indicate no relationship. Hence, for interpretation purposes, it is desirable for a regression procedure to be capable of producing estimates of β(t) that are exactly zero over regions with no apparent relationship and have simple structures over the remaining regions. Unfortunately, most fitting procedures result in an estimate for β(t) that is rarely exactly zero and has unnatural wiggles making the curve hard to interpret. In this article we introduce a new approach which uses variable selection ideas, applied to various derivatives of β(t), to produce estimates that are interpretable, flexible, and accurate. We call our method "Functional Linear Regression That's Interpretable" (FLiRTI) and demonstrate it on simulated and real-world data sets. In addition, non-asymptotic theoretical bounds on the estimation error are presented. The bounds provide strong theoretical motivation for our approach. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/08-AOS641
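    A heavily simplified sketch of the underlying idea (not the FLiRTI method itself, which penalizes several derivatives of β(t)): discretize the functional predictor on a grid and apply an L1 penalty directly to the β(t) values, so the estimate is exactly zero over regions with no relationship. All data and tuning values below are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, T = 100, 50
t = np.linspace(0, 1, T)
dt = t[1] - t[0]

# rough functional predictors: random walks observed on the grid
X = np.cumsum(rng.standard_normal((n, T)), axis=1)

# true coefficient function: nonzero only on one region of t
beta = np.where((t > 0.3) & (t < 0.5), 5.0, 0.0)
y = X @ beta * dt + 0.1 * rng.standard_normal(n)

# L1 penalty on the discretized beta(t) values (a zeroth-derivative
# simplification of the FLiRTI idea)
fit = Lasso(alpha=0.01, max_iter=10000).fit(X * dt, y)
beta_hat = fit.coef_
```

    Penalizing higher derivatives of β(t) instead, as FLiRTI does, yields estimates that are also piecewise-simple (e.g. piecewise constant or linear) over the nonzero regions.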

    Curve alignment by moments

    A significant problem with most functional data analyses is that of misaligned curves. Without adjustment, even an analysis as simple as estimation of the mean will fail. One common method to synchronize a set of curves involves equating "landmarks" such as peaks or troughs. The landmarks method can work well but will fail if marker events cannot be identified or are missing from some curves. An alternative approach, the "continuous monotone registration" method, works by transforming the curves so that they are as close as possible to a target function. This method can also perform well but is highly dependent on identifying an accurate target function. We develop an alignment method based on equating the "moments" of a given set of curves. These moments are intended to capture the locations of important features which may represent local behavior, such as maxima and minima, or more global characteristics, such as the slope of the curve averaged over time. Our method works by equating the moments of the curves while also shrinking toward a common shape. This allows us to capture the advantages of both the landmark and continuous monotone registration approaches. The method is illustrated on several data sets and a simulation study is performed. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/07-AOAS127
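    The core idea can be sketched in a few lines: treat each (nonnegative) curve as a density, compute its moments, and align curves by equating them. The toy example below (synthetic Gaussian bumps, shift-only alignment via the first moment) is a simplification of the method, which also matches higher moments and shrinks toward a common shape:

```python
import numpy as np

def curve_moments(t, f, k_max=3):
    """First moment and central moments 2..k_max of a nonnegative
    curve f sampled on an evenly spaced grid t, treated as a density."""
    dt = t[1] - t[0]
    w = f.sum() * dt                          # total mass
    mean = (t * f).sum() * dt / w             # first moment (location)
    central = [((t - mean) ** k * f).sum() * dt / w
               for k in range(2, k_max + 1)]
    return np.array([mean] + central)

t = np.linspace(0, 1, 400)
f1 = np.exp(-(t - 0.4) ** 2 / 0.01)           # peak at t = 0.4
f2 = np.exp(-(t - 0.6) ** 2 / 0.01)           # same shape, peak at t = 0.6

m1, m2 = curve_moments(t, f1), curve_moments(t, f2)
shift = m2[0] - m1[0]                          # equate first moments
f2_aligned = np.interp(t + shift, t, f2)       # shift f2 back onto f1
```

    After the shift, the two curves coincide; matching second and higher moments additionally corrects differences in time scale and asymmetry.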

    Sparse regulatory networks

    In many organisms the expression levels of each gene are controlled by the activation levels of known "Transcription Factors" (TF). A problem of considerable interest is that of estimating the "Transcription Regulation Networks" (TRN) relating the TFs and genes. While the expression levels of genes can be observed, the activation levels of the corresponding TFs are usually unknown, greatly increasing the difficulty of the problem. Based on previous experimental work, it is often the case that partial information about the TRN is available. For example, certain TFs may be known to regulate a given gene or in other cases a connection may be predicted with a certain probability. In general, the biology of the problem indicates there will be very few connections between TFs and genes. Several methods have been proposed for estimating TRNs. However, they all suffer from problems such as unrealistic assumptions about prior knowledge of the network structure or computational limitations. We propose a new approach that can directly utilize prior information about the network structure in conjunction with observed gene expression data to estimate the TRN. Our approach uses L_1 penalties on the network to ensure a sparse structure. This has the advantage of being computationally efficient as well as making many fewer assumptions about the network structure. We use our methodology to construct the TRN for E. coli and show that the estimate is biologically sensible and compares favorably with previous estimates. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/10-AOAS350
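    How an L_1 penalty induces a sparse network can be illustrated with a toy version of the problem in which, unlike the real setting, the TF activities are taken as observed: fit one L1-penalized regression of each gene's expression on the TF activities, and most estimated connections come out exactly zero. All dimensions and the penalty level below are illustrative, and the sketch omits the paper's use of prior network information:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n_samples, n_tfs, n_genes = 80, 10, 30
A = rng.standard_normal((n_samples, n_tfs))   # TF activities (observed here)

# sparse true network: ~10% of TF-gene connections are active
W_true = rng.standard_normal((n_tfs, n_genes))
W_true *= rng.random((n_tfs, n_genes)) < 0.1
E = A @ W_true + 0.1 * rng.standard_normal((n_samples, n_genes))

# one L1-penalized regression per gene -> sparse network estimate
W_hat = np.column_stack([
    Lasso(alpha=0.05).fit(A, E[:, g]).coef_ for g in range(n_genes)
])
sparsity = np.mean(W_hat == 0)                 # fraction of absent edges
```

    In the paper's setting the activities A are latent and prior connection probabilities enter the penalty, but the sparsity mechanism is the same.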

    Exercise redox biochemistry: conceptual, methodological and technical recommendations

    Exercise redox biochemistry is of considerable interest owing to its translational value in health and disease. However, unaddressed conceptual, methodological and technical issues complicate attempts to unravel how exercise alters redox homeostasis in health and disease. Conceptual issues relate to misunderstandings that arise when the chemical heterogeneity of redox biology is disregarded, which often complicates attempts to use redox-active compounds and assess redox signalling. Further, it is seldom considered that levels of oxidised macromolecule adducts reflect both their formation and their repair. Methodological and technical issues relate to the use of outdated assays and/or inappropriate sample preparation techniques that confound biochemical redox analysis. After considering each of the aforementioned issues, we outline how each can be resolved and provide a unifying set of recommendations. We specifically recommend that investigators: consider chemical heterogeneity; use redox-active compounds judiciously; abandon flawed assays; carefully prepare samples and assay buffers; consider repair/metabolism; and use multiple biomarkers to assess oxidative damage and redox signalling.

    Simple algorithm for the correction of MRI image artefacts due to random phase fluctuations

    Grant support: This work was supported by EPSRC [grant numbers EP/E036775/1, EP/K020293/1] and received funding from the European Union's Horizon 2020 research and innovation programme [grant agreement No 668119, project “IDentIFY”].

    Sleep-dependent consolidation in children with comprehension and vocabulary weaknesses: it'll be alright on the night?

    BACKGROUND: Vocabulary is crucial for an array of life outcomes and is frequently impaired in developmental disorders. Notably, 'poor comprehenders' (children with reading comprehension deficits but intact word reading) often have vocabulary deficits, but underlying mechanisms remain unclear. Prior research suggests intact encoding but difficulties consolidating new word knowledge. We test the hypothesis that poor comprehenders' sleep-associated vocabulary consolidation is compromised by their impoverished lexical-semantic knowledge. METHODS: Memory for new words was tracked across wake and sleep to assess encoding and consolidation in 8-to-12-year-old good and poor comprehenders. Each child participated in two sets of sessions, one beginning in the morning (AM-encoding) and the other in the evening (PM-encoding). In each case, they were taught 12 words and were trained on a spatial memory task. Memory was assessed immediately, and 12 and 24 hr later, via stem-completion, picture-naming, and definition tasks to probe different aspects of word knowledge. Long-term retention was assessed 1-2 months later. RESULTS: Recall of word-forms improved over sleep and postsleep wake, as measured in both stem-completion and picture-naming tasks. Counter to hypotheses, deficits for poor comprehenders were not observed in consolidation but instead were seen across measures and throughout testing, suggesting a deficit arising at encoding. Variability in vocabulary knowledge across the whole sample predicted sleep-associated consolidation, but only when words were learned early in the day and not when sleep followed soon after learning. CONCLUSIONS: Poor comprehenders showed weaker memory for new words than good comprehenders, but sleep-associated consolidation benefits were comparable between groups. Sleeping soon after learning had long-lasting benefits for memory and may be especially beneficial for children with weaker vocabulary. These results provide new insights into the breadth of poor comprehenders' vocabulary weaknesses, and into ways in which learning might be better timed to remediate vocabulary difficulties.